Lost in Translation: Authorship Attribution using Frame Semantics
نویسندگان
چکیده
We investigate authorship attribution using classifiers based on frame semantics. The purpose is to discover whether adding semantic information to lexical and syntactic methods for authorship attribution will improve them, specifically to address the difficult problem of authorship attribution of translated texts. Our results suggest (i) that frame-based classifiers are usable for author attribution of both translated and untranslated texts; (ii) that framebased classifiers generally perform worse than the baseline classifiers for untranslated texts, but (iii) perform as well as, or superior to the baseline classifiers on translated texts; (iv) that—contrary to current belief—naïve classifiers based on lexical markers may perform tolerably on translated texts if the combination of author and translator is present in the training set of a classifier.
منابع مشابه
Translation and Hybridity in Scenes and Frames Semantics
The present study is a theoretical attempt to illustrate how Fillmore's Scenes and Frames Semantics (SFS) could be employed as a framework to portray the process of understanding and translating hybrid texts. It first reviews the origin of SFS; then it maps SFS onto Nida’s linguistic model of translation process and the Interpretive Theory of Translation; it examines in the next section, withi...
متن کاملApplication of Frame Semantics to Teaching Seeing and Hearing Vocabulary to Iranian EFL Learners
A term in one language rarely has an absolute synonymous meaning in the same language; besides, it rarely has an equivalent meaning in an L2. English synonyms of seeing and hearing are particularly grammatically and semantically different. Frame semantics is a good tool for discovering differences between synonymous words in L2 and differences between supposed L1 and L2 equivalents. Vocabulary ...
متن کاملCross-Language Authorship Attribution
This paper presents a novel task of cross-language authorship attribution (CLAA), an extension of authorship attribution task to multilingual settings: given data labelled with authors in language X , the objective is to determine the author of a document written in language Y , where X 6= Y . We propose a number of cross-language stylometric features for the task of CLAA, such as those based o...
متن کاملDeep Level Lexical Features for Cross-lingual Authorship Attribution
Crosslingual document classification aims to classify documents written in different languages that share a common genre, topic or author. Knowledge-based methods and others based on machine translation deliver state-of-the-art classification accuracy, however because of their reliance on external resources, poorly resourced languages present a challenge for these type of methods. In this paper...
متن کاملRobust Word Sense Translation by EM Learning of Frame Semantics
We propose a robust method of automatically constructing a bilingual word sense dictionary from readily available monolingual ontologies by using estimation-maximization, without any annotated training data or manual tuning. We demonstrate our method on the English FrameNet and Chinese HowNet structures. Owing to the robustness of EM iterations in improving translation likelihoods, our word sen...
متن کامل